China’s GDP forecasting using Long Short Term Memory Recurrent Neural Network and Hidden Markov Model

This paper presents a Long Short Term Memory Recurrent Neural Network and Hidden Markov Model (LSTM-HMM) to predict China’s Gross Domestic Product (GDP) fluctuation state within a rolling time window. We compare the predictive power of LSTM-HMM with other dynamic forecast systems within different time windows, which involves the Hidden Markov Model (HMM), Gaussian Mixture Model-Hidden Markov Model (GMM-HMM) and LSTM-HMM with an input of monthly Consumer Price Index (CPI) or quarterly CPI within 4-year, 6-year, 8-year and 10-year time window. These forecasting models employed in our empirical analysis share the basic HMM structure but differ in the generation of observable CPI fluctuation states. Our forecasting results suggest that (1) among all the models, LSTM-HMM generally performs better than the other models; (2) the model performance can be improved when model input transforms from quarterly to monthly; (3) among all the time windows, models within 10-year time window have better overall performance; (4) within 10-year time window, the LSTM-HMM, with either quarterly or monthly input, has the best accuracy and consistency.

part about innovation is on page 3. As for the findings of this paper, we give a comprehensive review of the experimental results across time windows and models, shown in Table 14 and Fig. 12. These results show that, among all the time windows, models within the 8-year time window have better overall accuracy and consistency; that LSTM-HMM with monthly CPI input generally has good precision; and that within the 8-year time window it has the best accuracy and consistency. The two most important reasons are that (1) an 8-year window, with 32 observations in each round of training, is long enough to observe all types of GDP fluctuation states yet short enough to avoid the bias caused by too many observations of a certain type; and (2) LSTM takes more of the effects of historical CPI into the training process and tends to predict a suitable real-time CPI fluctuation state, which replaces the lagged observable states adopted by the other models and helps in the prediction of GDP fluctuation using HMM. The revised part is on page 25. All the revised parts are given as follows.
Revised part on page 3: Motivated by the usefulness of HMM and LSTM in economic forecasting and the sparsity of literature on the application of LSTM-HMM in GDP forecasting, this paper establishes an LSTM-HMM to predict GDP fluctuation states. We further compare the predictive power of LSTM-HMM with HMM and GMM-HMM using monthly CPI or quarterly CPI within different time windows. There are three innovation points. First, in LSTM-HMM, we utilise LSTM to predict real-time CPI fluctuation states and feed this prediction into the forecast of real-time GDP fluctuation states. Second, we select the inflation indicator CPI as the model input and utilise the available monthly and quarterly CPI respectively in the LSTM-HMM. Third, we find from the empirical analysis of China's GDP fluctuation states that, among all the time windows of rolling prediction, models within the 8-year time window have better overall accuracy and consistency, and that LSTM-HMM with monthly CPI input generally has good precision and achieves the best accuracy and consistency within the 8-year time window.
Revised part on page 25 and page 26: ......There are several reasons for these model results: firstly, an 8-year window, with 32 observations in each round of training, is long enough to observe all types of GDP fluctuation states yet short enough to avoid the bias caused by too many observations of a certain type, which makes it a proper data source for LSTM-HMM; secondly, through the application of LSTM, LSTM-HMM takes more of the effects of historical CPI into the training process and tends to predict a suitable real-time CPI fluctuation state, which replaces the lagged observable states adopted by the other models and helps in the prediction of GDP fluctuation using HMM;
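The rolling-window setup described above can be illustrated with a minimal sketch. It assumes quarterly data (4 observations per year, so an 8-year window holds 32 observations) and a window that advances by one observation per round. The function name `rolling_windows` and the placeholder series are hypothetical and are not taken from the manuscript.

```python
def rolling_windows(series, window_years, periods_per_year=4):
    """Yield (training_window, next_observation) pairs for a rolling forecast."""
    size = window_years * periods_per_year  # e.g. 8 years * 4 quarters = 32
    for start in range(len(series) - size):
        # Train on `size` consecutive observations, then target the next one.
        yield series[start:start + size], series[start + size]


# Placeholder quarterly series: 10 years of dummy labels, not real data.
quarterly_states = list(range(40))
rounds = list(rolling_windows(quarterly_states, window_years=8))
first_window, first_target = rounds[0]
print(len(first_window), len(rounds))
```

Under these assumptions, a shorter window yields more training rounds with fewer observations each, which is the trade-off the reply refers to.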

Reviewer 1 COMMENT 3:
What is the motivation to introduce a LSTM?
Reply: Thank you very much for the comment. This is a very helpful suggestion. LSTM is a key component of LSTM-HMM and is characterized by its capability to process tasks involving long time lags. There are studies on economic forecasting that utilise LSTM (Zhang and Huang, 2021; Zahara et al., 2020), and there is also research on the application of HMM in economic forecasting (Gregoir and Lenglart, 2000; Bellone and Gautier, 2004). However, there is no research on GDP forecasting that utilises LSTM-HMM, in which LSTM predicts the observable states for HMM. This motivates our research, which tries to combine the potential of LSTM and HMM. We have detailed the motivation for introducing LSTM to HMM on page 2 and page 3. The revised parts are as follows:
Revised part on page 2: Gross Domestic Product (GDP) is a key indicator of economic growth, which measures the total value of goods and services produced within a country in a year, not including its income from investments in other countries. There is considerable literature on economic prediction in financial markets that utilises the Hidden Markov Model (HMM) (Gregoir and Lenglart, 2000; Bellone and Gautier, 2004) and the Long Short-Term Memory (LSTM) recurrent neural network (Zhang and Huang, 2021; Zahara et al., 2020). However, there is no study on the application of LSTM-HMM in GDP forecasting that combines the potential of these two models, which motivates our research on China's GDP forecasting using LSTM-HMM based on the dynamic relationship between inflation and economic growth.
Revised part on page 3: Long Short-Term Memory (LSTM) is a type of recurrent neural network that works better on tasks involving long time lags by bridging huge time lags between relevant input events (Ronald J and Jing, 1990), and it has emerged as an effective and scalable model for time-series prediction. LSTM has been applied to CPI prediction in Indonesia with multivariate input (Zahara et al., 2020). LSTM has also been applied to optimal hedging in the presence of market frictions (Zhang and Huang, 2021), where it proves useful in the empirical analysis of real option markets.
Motivated by the usefulness of HMM and LSTM in economic forecasting and the sparsity of literature on the application of LSTM-HMM in GDP forecasting, this paper establishes an LSTM-HMM to predict GDP fluctuation states ......
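To make the combination concrete, the following is a minimal sketch of how a predicted real-time observable state could be fed into a discrete HMM. It is not the implementation used in the manuscript: the two-state setup, all probabilities, and the names `forward_filter` and `predicted_now` are hypothetical, and the LSTM is replaced by a fixed stand-in value. The sketch uses the standard HMM forward recursion to obtain the probability of each hidden GDP fluctuation state given the observed CPI fluctuation states.

```python
def forward_filter(pi, A, B, obs):
    """Return P(hidden state at the last step | all observations) for a discrete HMM."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    total = sum(alpha)
    alpha = [a / total for a in alpha]
    for o in obs[1:]:
        # Propagate through the transition matrix, then weight by the emission.
        pred = [sum(alpha[i] * A[i][j] for i in range(n)) for j in range(n)]
        alpha = [pred[j] * B[j][o] for j in range(n)]
        total = sum(alpha)
        alpha = [a / total for a in alpha]
    return alpha


# Hypothetical parameters: 2 hidden GDP states, 2 observable CPI states.
pi = [0.5, 0.5]                    # initial hidden-state distribution
A = [[0.9, 0.1], [0.2, 0.8]]       # hidden-state transition probabilities
B = [[0.8, 0.2], [0.3, 0.7]]       # emission probabilities per hidden state

history = [0, 0, 1]                # past observable CPI fluctuation states
predicted_now = 1                  # stand-in for an LSTM's real-time CPI state prediction
posterior = forward_filter(pi, A, B, history + [predicted_now])
gdp_state = max(range(len(posterior)), key=lambda i: posterior[i])
```

The design point is that the final observation comes from a predictor rather than from a lagged observed value; everything else is an ordinary HMM filtering step.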

Reviewer 1 COMMENT 4:
In section 2.1, I found the notation confusing. The authors write that "S is a discrete set (...), where t stands for time.", but I do not see t before that. After this phrase, the authors start to use s_t; is this the same capital S defined before?
Reply: Thank you very much for the comment. We have modified the definition of the discrete set S, which is now written as a time series indexed by time t. The revised part is as follows: Revised part on page 4: ...... S = {s_1, s_2, ..., s_t, ...} is a discrete set of GDP fluctuation states, where t stands for time. ......

Reviewer 1 COMMENT 5:
In Section 3.1, when describing Fig. 5 the authors mention "the curve" and "the straight line". I suppose this is a typo.
Reply: Thank you very much for the comment. To distinguish the GDP and CPI growth rates, we have replaced Fig.5 with a new figure, in which the GDP growth rate is represented by a solid line and the CPI growth rate by a dotted line. The revised part is as follows: Revised part on page 10: (See Fig.1) We first describe the trend of the annual CPI and GDP data to gain a general understanding of the similarity of their growth rates. The trend of the annual GDP growth rate and CPI growth rate is shown in Fig.1, where the dotted line represents the annual growth rate of CPI and the solid line represents the annual growth rate of GDP.

Reviewer 1 COMMENT 6:
In section 3.2, the Granger test does not exactly measure causality; therefore it is better to use the term Granger predictive causality, and clarify this in the text.
...... We conduct a test of Granger predictive causality with x_t and y_t to obtain the interaction relationship between CPI and GDP in the same period. Whereas correlation between two variables indicates comovement, Granger causality relates to the idea of incremental predictive power of one time series for forecasting another time series, which is a statistically testable criterion based on the ideas of precedence and predictive power (Croux and Reusens, 2013; Yao et al., 2000). ......
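The notion of incremental predictive power can be sketched as a comparison of two lag regressions. The example below is an illustration only, assuming NumPy is available and using synthetic data; the function name `granger_f_stat` is hypothetical, and the standard restricted-versus-unrestricted F-test shown here is not necessarily the exact procedure used in the manuscript.

```python
import numpy as np


def granger_f_stat(cause, effect, lags):
    """F-statistic for H0: lagged `cause` adds no predictive power for `effect`."""
    cause = np.asarray(cause, dtype=float)
    effect = np.asarray(effect, dtype=float)
    total = len(effect)
    n = total - lags
    target = effect[lags:]
    # Column k holds the series shifted back by k periods.
    own_lags = np.column_stack([effect[lags - k:total - k] for k in range(1, lags + 1)])
    cause_lags = np.column_stack([cause[lags - k:total - k] for k in range(1, lags + 1)])
    ones = np.ones((n, 1))
    restricted = np.hstack([ones, own_lags])                # own history only
    unrestricted = np.hstack([ones, own_lags, cause_lags])  # plus the other series

    def rss(design):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        return float(resid @ resid)

    rss_r, rss_u = rss(restricted), rss(unrestricted)
    dof = n - unrestricted.shape[1]
    return ((rss_r - rss_u) / lags) / (rss_u / dof)


# Synthetic example (not real CPI/GDP data): y depends on the previous value of x.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = np.zeros(200)
y[1:] = 0.8 * x[:-1] + 0.3 * rng.normal(size=199)

f_x_to_y = granger_f_stat(x, y, lags=2)
f_y_to_x = granger_f_stat(y, x, lags=2)
```

A large F-statistic in one direction says only that the lagged series improves the forecast of the other, which is why the term "predictive causality" is the more accurate description.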

Reviewer 1 COMMENT 7:
The results of the article are based on the comparison of numbers without any clear interpretation. Many of the numerical results have four to five decimal digits (e.g. γ). Is this precision really significant? For instance, the authors draw conclusions and compare models with an accuracy of 0.6406 and 0.6563. Is this difference really significant?
Reply: Thank you very much for the comment. We now report numerical results to 2 decimal places for a uniform and meaningful presentation, following Bellone and Gautier (2004), who use 2 decimal places in their economic downturn analysis and HMM prediction. The revised parts are as follows: Revised part on page 11: (See Table 1) ...... with high overall significance since its P-value is less than 0.01. We then test the stationarity of the residuals e_t of the linear model using a unit root test to avoid spurious regression, where the P-value with lag order of 4 is below the 5% significance level. The result of the unit root test shows the stationarity of the time series e_t involved in our analysis, and we then conduct a cointegration test using the method proposed by Phillips and Ouliaris (1990), which shows that at the 10% significance level we can reject the hypothesis of no cointegration relationship, since the test statistic is 29.83, higher than the right-tail critical value of 27.43.
Revised part on page 12: (See Table 2) As we can see from Table 2, the correlation of x_t and y_t is 0.88 with a P-value less than 1%.
Revised part on page 13: The 90% confidence intervals under different scales are also plotted (the dotted line), so we can read off the asymptotic 90% confidence set [101.33, 102.67] from the graph, from where the test statistics cross the dotted line, where γ is significant when taking the value of 101.37.
Revised part on page 14: (See Table 3) As we can see from Table 3, the model allows the slope parameters to differ depending on the value of h_t. The slope coefficient θ_1 is 0.32 given h_t ≤ 101.37, while the slope coefficient θ_2 is 0.62 given h_t > 101.37. ......
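The regime-dependent slope reported above can be expressed as a small lookup. This is a sketch only: the names `GAMMA`, `THETA_1`, `THETA_2` and `slope_for` are hypothetical, the intercepts of the threshold model are not given in this reply and are therefore omitted, and assigning the boundary value h_t = 101.37 to the lower regime is an assumption.

```python
GAMMA = 101.37    # estimated threshold of h_t, as reported in the reply
THETA_1 = 0.32    # slope coefficient in the lower regime
THETA_2 = 0.62    # slope coefficient in the upper regime


def slope_for(h_t):
    """Return the slope coefficient that applies for a given threshold-variable value."""
    # Assumption: the boundary value itself belongs to the lower regime.
    return THETA_1 if h_t <= GAMMA else THETA_2


low_regime_slope = slope_for(101.0)
high_regime_slope = slope_for(102.0)
```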

Reviewer 2 COMMENT 1:
Throughout the text the author describes several properties without specifying what CPI means. Therefore, the author must need to define and make clear what CPI means.
Reply: Thank you very much for the comment. We have added the meaning of CPI to the abstract and describe CPI as a key indicator of the inflation rate in the introduction. The revised parts are as follows: ...... Among these studies, the Consumer Price Index (CPI) has been utilised as the indicator of the inflation rate (Sarel, 1996; Hwang and Wu, 2011), while Gross Domestic Product (GDP) is also a key indicator of economic growth (Sarel, 1996; Gerlach-Kristen, 2009). ......

Reviewer 2 COMMENT 2:
In the last paragraph of the Conclusion Section the author states that "In practical application, we should try our best to reduce the impact of selection bias.". The author should clarify what he means by "practical application" and what kinds of selection bias is he talking about.
Reply: Thank you very much for the comment. We have added an explanation of selection bias and replaced "practical application" with a more detailed explanation. The revised parts are as follows: Revised part on page 26: ...... In the empirical analysis of GDP fluctuation state forecasting, models that incorporate HMM tend to ignore the effect of the sparse states, and a large imbalance in the number of observations of each state will result in selection bias. In this paper, we reduce the impact of selection bias by running experiments within pre-selected time window lengths of 4, 6, 8 and 10 years, and find that the model performance is better within the 8-year window. ......

Reviewer 2 COMMENT 3:
The author must revise all figures and tables of the manuscript, with clear descriptions in the respective captions, in order to improve the quality and readability of the manuscript. For instance, there is a typo in Table 13.
...... (See Table 4)
Null hypothesis                 Lag period   Sample size   F-value   P-value
y_t is not the cause of x_t     5            80            2.578     0.04*
x_t is not the cause of y_t     2            80            9.79      ***
Revised part on page 14: (See Table 5)
...... (See Table 6 and Table 7) ...... Appropriate-speed growth stage 2 High-speed growth stage x > 101.37 Intensified inflation ......
Figure 5: The structure of the LSTM we build to predict CPI fluctuation states using historical CPI series.
Revised part on page 16:
Revised part on page 19: (See Table 9)
Revised part on page 20: (See Fig.7 and Table 10)
Revised part on page 21: (See Fig.8 and Table 11)
Revised part on page 23: (See Fig.10)
...... (See Table 13 and Fig.11)
...... (See Table 14)

...... is a clear spread in the ROC curves. I think that the author must provide some indication as to why this is so.
Reply: Thank you very much for the comment. We have added a description of the performance of the ROC curves within each time window and tried to provide some indication of the reason. The revised part is as follows: Revised part on page 24 and page 25: For the 4-year time window in Fig.11(a), the ROC curves do not vary much. Among the models, LSTM-HMM has better performance with both quarterly and monthly input. LSTM-HMM and HMM perform better with quarterly input, but GMM-HMM performs better with monthly input. For the 6-year time window in Fig.11(b), there is a clear spread between the ROC curves of HMM(q), LSTM-HMM(q), GMM-HMM(m) and LSTM-HMM(m) on the one hand and the ROC curves of GMM-HMM(q) and HMM(m) on the other. This spread may result from the model input, as GMM-HMM with monthly input and a 6-year time window performs better than GMM-HMM with quarterly input and a 6-year time window, while for HMM with a 6-year time window, the model with quarterly input performs better. For the 8-year time window in Fig.11(c), there is also a clear spread between the ROC curves, which is similar to the spread in the 6-year time window and results from the different model inputs. In addition, LSTM-HMM(m) performs best, with the highest AUC value.
For the 10-year time window in Fig.11(d), the ROC curves share similar trends, where LSTM-HMM and HMM perform better with quarterly input, but GMM-HMM performs better with monthly